Every Layer Counts: Multi-Layer Multi-Head Attention for Neural Machine Translation

Similar resources

Paying Attention to Multi-Word Expressions in Neural Machine Translation

Processing of multi-word expressions (MWEs) is a known problem for any natural language processing task. Even neural machine translation (NMT) struggles to overcome it. This paper presents results of experiments on investigating NMT attention allocation to the MWEs and improving automated translation of sentences that contain MWEs in English→Latvian and English→Czech NMT systems. Two improvemen...

Multi-channel Encoder for Neural Machine Translation

Attention-based Encoder-Decoder has the effective architecture for neural machine translation (NMT), which typically relies on recurrent neural networks (RNN) to build the blocks that will be lately called by attentive reader during the decoding process. This design of encoder yields relatively uniform composition on source sentence, despite the gating mechanism employed in encoding RNN. On the...

Nonsingular Green’s Functions for Multi-Layer Homogeneous Microstrip Lines

In this article, three new Green's functions are presented for a narrow strip line (not a thin wire) inside or on a homogeneous dielectric, supposing quasi-TEM dominant mode. These functions have no singularity in contrast to so far presented ones, so that they can be used easily to determine the capacitance matrix of multi-layer and single-layer homogeneous coupled microstrip lines. To obtain ...

Stack-based Multi-layer Attention for Transition-based Dependency Parsing

Although sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing cannot yield a comparable performance gain as in other state-of-the-art methods, such as stack-LSTM and head selection. In this paper, we propose a stack-based multi-layer attent...

A Novel Reordering Model Based on Multi-layer Phrase for Statistical Machine Translation

Phrase reordering is of great importance for statistical machine translation. According to the movement of phrase translation, the pattern of phrase reordering can be divided into three classes: monotone, BTG (Bracket Transduction Grammar) and hierarchy. It is a good way to use different styles of reordering models to reorder different phrases according to the characteristics of both the reorde...

Journal

Journal title: Prague Bulletin of Mathematical Linguistics

Year: 2020

ISSN: 1804-0462,0032-6585

DOI: 10.14712/00326585.005